Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 42 |
| Number of observations | 750000 |
| Missing cells | 299345 |
| Missing cells (%) | 1.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 240.3 MiB |
| Average record size in memory | 336.0 B |
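The derived figures in this table can be cross-checked from the raw counts. The snippet below is a minimal sketch, assuming the missing-cell percentage is the missing count divided by variables × observations and that 1 MiB is 2**20 bytes. The variable names are illustrative and not part of the report.

```python
# Cross-check the derived dataset statistics from the raw counts above.
n_variables = 42
n_observations = 750_000
missing_cells = 299_345
total_size_mib = 240.3  # as reported, already rounded to one decimal

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values agree with the table once rounded to one decimal place.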
Variable types
| Type | Count |
|---|---|
| NUM | 27 |
| CAT | 12 |
| BOOL | 3 |
Warnings
| Warning | Type |
|---|---|
| policy_code has constant value "750000" | Constant |
| issue_date has a high cardinality: 139 distinct values | High cardinality |
| earlies_credit_mon has a high cardinality: 718 distinct values | High cardinality |
| monthly_payment is highly correlated with total_loan | High correlation |
| total_loan is highly correlated with monthly_payment | High correlation |
| scoring_high is highly correlated with scoring_low | High correlation |
| scoring_low is highly correlated with scoring_high | High correlation |
| early_return_amount_3mon is highly correlated with early_return_amount | High correlation |
| early_return_amount is highly correlated with early_return_amount_3mon | High correlation |
| sub_class is highly correlated with class | High correlation |
| class is highly correlated with sub_class | High correlation |
| work_year has 43847 (5.8%) missing values | Missing |
| f0 has 37798 (5.0%) missing values | Missing |
| f1 has 65411 (8.7%) missing values | Missing |
| f2 has 37798 (5.0%) missing values | Missing |
| f3 has 37799 (5.0%) missing values | Missing |
| f4 has 37798 (5.0%) missing values | Missing |
| f5 has 37798 (5.0%) missing values | Missing |
| debt_loan_ratio is highly skewed (γ1 = 27.21025421) | Skewed |
| f1 is highly skewed (γ1 = 42.0206785) | Skewed |
| loan_id has unique values | Unique |
| user_id has unique values | Unique |
| house_exist has 370915 (49.5%) zeros | Zeros |
| offsprings has 491460 (65.5%) zeros | Zeros |
| use has 435098 (58.0%) zeros | Zeros |
| region has 25461 (3.4%) zeros | Zeros |
| del_in_18month has 605314 (80.7%) zeros | Zeros |
| pub_dero_bankrup has 656470 (87.5%) zeros | Zeros |
| early_return has 690074 (92.0%) zeros | Zeros |
| early_return_amount has 690074 (92.0%) zeros | Zeros |
| early_return_amount_3mon has 690074 (92.0%) zeros | Zeros |
| title has 368735 (49.2%) zeros | Zeros |
| f1 has 684067 (91.2%) zeros | Zeros |
| f2 has 21233 (2.8%) zeros | Zeros |
| f5 has 543147 (72.4%) zeros | Zeros |
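The percentages in the missing-value and zero warnings follow directly from the counts and the 750000 observations. As a minimal sketch, the snippet below recomputes a handful of them. The selection of columns and the helper `pct` are illustrative choices, not part of the report.

```python
# Recompute a few warning percentages from their counts (750,000 rows).
N_ROWS = 750_000

# (column, count, reported percentage) copied from the warnings above.
reported = [
    ("work_year", 43_847, 5.8),       # missing values
    ("f1", 65_411, 8.7),              # missing values
    ("house_exist", 370_915, 49.5),   # zeros
    ("early_return", 690_074, 92.0),  # zeros
]


def pct(count, total=N_ROWS):
    """Share of rows as a percentage, rounded to one decimal like the report."""
    return round(100 * count / total, 1)


recomputed = {name: pct(count) for name, count, _ in reported}
for name, _, shown in reported:
    print(f"{name}: {recomputed[name]}% (report shows {shown}%)")
```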
Reproduction
| Item | Value |
|---|---|
| Analysis started | 2021-10-15 01:37:53.053074 |
| Analysis finished | 2021-10-15 01:44:13.468313 |
| Duration | 6 minutes and 20.42 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
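The reported duration is the difference between the two timestamps. A small sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-10-15 01:37:53.053074", FMT)
finished = datetime.strptime("2021-10-15 01:44:13.468313", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```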
| Distinct | 750000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 400033.7122 |
|---|---|
| Minimum | 0 |
| Maximum | 799999 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 39991.95 |
| Q1 | 199923.75 |
| median | 400077.5 |
| Q3 | 600051.25 |
| 95-th percentile | 760060.05 |
| Maximum | 799999 |
| Range | 799999 |
| Interquartile range (IQR) | 400127.5 |
Descriptive statistics
| Standard deviation | 230968.5423 |
|---|---|
| Coefficient of variation (CV) | 0.5773726944 |
| Kurtosis | -1.200399316 |
| Mean | 400033.7122 |
| Median Absolute Deviation (MAD) | 200064 |
| Skewness | -0.0002873238239 |
| Sum | 3.000252841e+11 |
| Variance | 5.334646751e+10 |
| Monotonicity | Not monotonic |
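Several rows of this table are derived from others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks both relationships against the printed values. A tolerance is used because the inputs are already rounded to ten significant digits.

```python
import math

# Values copied from the descriptive statistics above.
std = 230968.5423
mean = 400033.7122
reported_cv = 0.5773726944
reported_variance = 5.334646751e10

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the square of the standard deviation

print(f"CV       = {cv:.6f}")
print(f"variance = {variance:.6e}")
print(math.isclose(cv, reported_cv, rel_tol=1e-6))
print(math.isclose(variance, reported_variance, rel_tol=1e-6))
```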
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 537171 | 1 | < 0.1% | |
| 524885 | 1 | < 0.1% | |
| 531030 | 1 | < 0.1% | |
| 528983 | 1 | < 0.1% | |
| 551512 | 1 | < 0.1% | |
| 555610 | 1 | < 0.1% | |
| 553563 | 1 | < 0.1% | |
| 543324 | 1 | < 0.1% | |
| 547422 | 1 | < 0.1% | |
| Other values (749990) | 749990 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 799999 | 1 | < 0.1% | |
| 799998 | 1 | < 0.1% | |
| 799996 | 1 | < 0.1% | |
| 799994 | 1 | < 0.1% | |
| 799993 | 1 | < 0.1% |
| Distinct | 750000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 374999.5 |
|---|---|
| Minimum | 0 |
| Maximum | 749999 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 37499.95 |
| Q1 | 187499.75 |
| median | 374999.5 |
| Q3 | 562499.25 |
| 95-th percentile | 712499.05 |
| Maximum | 749999 |
| Range | 749999 |
| Interquartile range (IQR) | 374999.5 |
Descriptive statistics
| Standard deviation | 216506.4953 |
|---|---|
| Coefficient of variation (CV) | 0.5773514239 |
| Kurtosis | -1.2 |
| Mean | 374999.5 |
| Median Absolute Deviation (MAD) | 187500 |
| Skewness | -1.963100175e-15 |
| Sum | 2.81249625e+11 |
| Variance | 4.68750625e+10 |
| Monotonicity | Strictly increasing |
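These statistics are what one would expect if the column is simply the integers 0 through 749999, each appearing once, which the minimum, maximum, distinct count and strictly increasing order suggest. As a sketch under that assumption, the closed-form expressions for such a sequence reproduce the mean, sum, variance and standard deviation. The variance here uses the n - 1 denominator, which is assumed to be what the report uses.

```python
import math

n = 750_000  # number of observations; values assumed to be 0, 1, ..., n - 1

mean = (n - 1) / 2
total = n * (n - 1) // 2
# Sample variance (n - 1 denominator) of the integers 0..n-1.
variance = n * (n + 1) / 12
std = math.sqrt(variance)

print(f"mean     = {mean}")
print(f"sum      = {total:.9e}")
print(f"variance = {variance:.8e}")
print(f"std      = {std:.4f}")
```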
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 152351 | 1 | < 0.1% | |
| 238369 | 1 | < 0.1% | |
| 244514 | 1 | < 0.1% | |
| 242467 | 1 | < 0.1% | |
| 232228 | 1 | < 0.1% | |
| 230181 | 1 | < 0.1% | |
| 236326 | 1 | < 0.1% | |
| 234279 | 1 | < 0.1% | |
| 256808 | 1 | < 0.1% | |
| Other values (749990) | 749990 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 749999 | 1 | < 0.1% | |
| 749998 | 1 | < 0.1% | |
| 749997 | 1 | < 0.1% | |
| 749996 | 1 | < 0.1% | |
| 749995 | 1 | < 0.1% |
| Distinct | 1540 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14419.40653 |
|---|---|
| Minimum | 500 |
| Maximum | 40000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 500 |
|---|---|
| 5-th percentile | 3200 |
| Q1 | 8000 |
| median | 12000 |
| Q3 | 20000 |
| 95-th percentile | 32875 |
| Maximum | 40000 |
| Range | 39500 |
| Interquartile range (IQR) | 12000 |
Descriptive statistics
| Standard deviation | 8717.343741 |
|---|---|
| Coefficient of variation (CV) | 0.6045563471 |
| Kurtosis | -0.0843171787 |
| Mean | 14419.40653 |
| Median Absolute Deviation (MAD) | 6000 |
| Skewness | 0.7823678844 |
| Sum | 1.08145549e+10 |
| Variance | 75992081.9 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10000 | 55345 | 7.4% | |
| 12000 | 40643 | 5.4% | |
| 20000 | 39334 | 5.2% | |
| 15000 | 39054 | 5.2% | |
| 35000 | 28511 | 3.8% | |
| 5000 | 27372 | 3.6% | |
| 8000 | 26386 | 3.5% | |
| 6000 | 24976 | 3.3% | |
| 16000 | 20354 | 2.7% | |
| 25000 | 19067 | 2.5% | |
| Other values (1530) | 428958 | 57.2% |
| Value | Count | Frequency (%) | |
| 500 | 2 | < 0.1% | |
| 700 | 1 | < 0.1% | |
| 725 | 1 | < 0.1% | |
| 750 | 1 | < 0.1% | |
| 900 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 40000 | 3773 | 0.5% | |
| 39975 | 3 | < 0.1% | |
| 39950 | 1 | < 0.1% | |
| 39900 | 4 | < 0.1% | |
| 39875 | 1 | < 0.1% |
year_of_loan
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 3 | 568909 | 75.9% | |
| 5 | 181091 | 24.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
interest
Real number (ℝ≥0)
| Distinct | 639 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.24006247 |
|---|---|
| Minimum | 5.31 |
| Maximum | 30.99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 5.31 |
|---|---|
| 5-th percentile | 6.49 |
| Q1 | 9.75 |
| median | 12.74 |
| Q3 | 15.99 |
| 95-th percentile | 22.15 |
| Maximum | 30.99 |
| Range | 25.68 |
| Interquartile range (IQR) | 6.24 |
Descriptive statistics
| Standard deviation | 4.767528193 |
|---|---|
| Coefficient of variation (CV) | 0.3600835121 |
| Kurtosis | 0.5024593185 |
| Mean | 13.24006247 |
| Median Absolute Deviation (MAD) | 3.14 |
| Skewness | 0.7127235967 |
| Sum | 9930046.85 |
| Variance | 22.72932507 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 10.99 | 21197 | 2.8% | |
| 11.99 | 18437 | 2.5% | |
| 13.99 | 16368 | 2.2% | |
| 5.32 | 15968 | 2.1% | |
| 9.17 | 14507 | 1.9% | |
| 12.99 | 14312 | 1.9% | |
| 7.89 | 13845 | 1.8% | |
| 16.99 | 12719 | 1.7% | |
| 15.61 | 12597 | 1.7% | |
| 9.99 | 10875 | 1.5% | |
| Other values (629) | 599175 | 79.9% |
| Value | Count | Frequency (%) | |
| 5.31 | 515 | 0.1% | |
| 5.32 | 15968 | 2.1% | |
| 5.42 | 304 | < 0.1% | |
| 5.79 | 215 | < 0.1% | |
| 5.93 | 1017 | 0.1% |
| Value | Count | Frequency (%) | |
| 30.99 | 278 | < 0.1% | |
| 30.94 | 258 | < 0.1% | |
| 30.89 | 229 | < 0.1% | |
| 30.84 | 246 | < 0.1% | |
| 30.79 | 425 | 0.1% |
| Distinct | 70941 |
|---|---|
| Distinct (%) | 9.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 438.0329199 |
|---|---|
| Minimum | 15.69 |
| Maximum | 1715.42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 15.69 |
|---|---|
| 5-th percentile | 108.74 |
| Q1 | 248.45 |
| median | 375.16 |
| Q3 | 580.73 |
| 95-th percentile | 963.68 |
| Maximum | 1715.42 |
| Range | 1699.73 |
| Interquartile range (IQR) | 332.28 |
Descriptive statistics
| Standard deviation | 261.5134068 |
|---|---|
| Coefficient of variation (CV) | 0.5970177009 |
| Kurtosis | 0.7452647872 |
| Mean | 438.0329199 |
| Median Absolute Deviation (MAD) | 155.74 |
| Skewness | 1.006014006 |
| Sum | 328524689.9 |
| Variance | 68389.26191 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 327.34 | 1741 | 0.2% | |
| 301.15 | 1492 | 0.2% | |
| 332.1 | 1437 | 0.2% | |
| 318.79 | 1257 | 0.2% | |
| 491.01 | 1189 | 0.2% | |
| 312.86 | 1179 | 0.2% | |
| 392.81 | 1124 | 0.1% | |
| 602.3 | 1124 | 0.1% | |
| 361.38 | 1080 | 0.1% | |
| 451.73 | 1059 | 0.1% | |
| Other values (70931) | 737318 | 98.3% |
| Value | Count | Frequency (%) | |
| 15.69 | 1 | < 0.1% | |
| 16.31 | 1 | < 0.1% | |
| 19.87 | 1 | < 0.1% | |
| 20.22 | 1 | < 0.1% | |
| 21.25 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1715.42 | 1 | < 0.1% | |
| 1714.54 | 4 | < 0.1% | |
| 1691.28 | 1 | < 0.1% | |
| 1647.03 | 1 | < 0.1% | |
| 1618.03 | 1 | < 0.1% |
class
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| B | 219124 | 29.2% | |
| C | 212817 | 28.4% | |
| A | 130885 | 17.5% | |
| D | 112021 | 14.9% | |
| E | 52245 | 7.0% | |
| F | 17861 | 2.4% | |
| G | 5047 | 0.7% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
sub_class
Categorical
| Distinct | 35 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| C1 | 47515 | 6.3% | |
| B4 | 46366 | 6.2% | |
| B5 | 45896 | 6.1% | |
| B3 | 45626 | 6.1% | |
| C2 | 44065 | 5.9% | |
| C3 | 42000 | 5.6% | |
| C4 | 41524 | 5.5% | |
| B2 | 41498 | 5.5% | |
| B1 | 39738 | 5.3% | |
| C5 | 37713 | 5.0% | |
| Other values (25) | 318059 | 42.4% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
work_type
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 其他 (Other) | 273439 | 36.5% | |
| 职员 (Clerk) | 166548 | 22.2% | |
| 工人 (Worker) | 125219 | 16.7% | |
| 公务员 (Civil servant) | 101378 | 13.5% | |
| 工程师 (Engineer) | 83416 | 11.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 3 |
|---|---|
| Median length | 2 |
| Mean length | 2.246392 |
| Min length | 2 |
employer_type
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
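| 普通企业 (Ordinary enterprise) | 340530 | 45.4% | |
| 政府机构 (Government agency) | 193688 | 25.8% | |
| 幼教与中小学校 (Preschool, primary and secondary schools) | 74991 | 10.0% | |
| 上市企业 (Listed company) | 74968 | 10.0% | |
| 世界五百强 (Fortune Global 500 company) | 40538 | 5.4% | |
| 高等教育机构 (Higher education institution) | 25285 | 3.4% |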
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 7 |
|---|---|
| Median length | 4 |
| Mean length | 4.421441333 |
| Min length | 4 |
industry
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
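| 金融业 (Finance) | 120311 | 16.0% | |
| 电力、热力生产供应业 (Electricity and heat production and supply) | 90212 | 12.0% | |
| 公共服务、社会组织 (Public services and social organizations) | 75704 | 10.1% | |
| 住宿和餐饮业 (Accommodation and catering) | 67086 | 8.9% | |
| 文化和体育业 (Culture and sports) | 60397 | 8.1% | |
| 信息传输、软件和信息技术服务业 (Information transmission, software and IT services) | 59796 | 8.0% | |
| 建筑业 (Construction) | 52312 | 7.0% | |
| 房地产业 (Real estate) | 44894 | 6.0% | |
| 交通运输、仓储和邮政业 (Transportation, warehousing and postal services) | 37525 | 5.0% | |
| 农、林、牧、渔业 (Agriculture, forestry, animal husbandry and fishery) | 37407 | 5.0% | |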
| Other values (4) | 104356 | 13.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 15 |
|---|---|
| Median length | 6 |
| Mean length | 6.743008 |
| Min length | 3 |
work_year
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 43847 |
| Missing (%) | 5.8% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 10+ years | 246226 | 32.8% | |
| 2 years | 67987 | 9.1% | |
| < 1 year | 60198 | 8.0% | |
| 3 years | 60128 | 8.0% | |
| 1 year | 49204 | 6.6% | |
| 5 years | 47027 | 6.3% | |
| 4 years | 45037 | 6.0% | |
| 6 years | 34910 | 4.7% | |
| 8 years | 33857 | 4.5% | |
| 7 years | 33200 | 4.4% | |
| (Missing) | 43847 | 5.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 9 |
|---|---|
| Median length | 7 |
| Mean length | 7.437410667 |
| Min length | 3 |
house_exist
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6142746667 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 370915 |
| Zeros (%) | 49.5% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.675668984 |
|---|---|
| Coefficient of variation (CV) | 1.099946035 |
| Kurtosis | -0.4692585445 |
| Mean | 0.6142746667 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.6757277834 |
| Sum | 460706 |
| Variance | 0.4565285759 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 370915 | 49.5% | |
| 1 | 297928 | 39.7% | |
| 2 | 80876 | 10.8% | |
| 3 | 175 | < 0.1% | |
| 5 | 77 | < 0.1% | |
| 4 | 29 | < 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 370915 | 49.5% | |
| 1 | 297928 | 39.7% | |
| 2 | 80876 | 10.8% | |
| 3 | 175 | < 0.1% | |
| 4 | 29 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5 | 77 | < 0.1% | |
| 4 | 29 | < 0.1% | |
| 3 | 175 | < 0.1% | |
| 2 | 80876 | 10.8% | |
| 1 | 297928 | 39.7% |
house_loan_status
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 384838 | 51.3% | |
| 1 | 284190 | 37.9% | |
| 2 | 80972 | 10.8% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
censor_status
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 1 | 290428 | 38.7% | |
| 2 | 233440 | 31.1% | |
| 0 | 226132 | 30.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
marriage
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 380175 | 50.7% | |
| 1 | 336871 | 44.9% | |
| 2 | 26260 | 3.5% | |
| 3 | 6694 | 0.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
offsprings
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.082874667 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 491460 |
| Zeros (%) | 65.5% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 2 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.691962058 |
|---|---|
| Coefficient of variation (CV) | 1.562472658 |
| Kurtosis | 0.06793541089 |
| Mean | 1.082874667 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.251490583 |
| Sum | 812156 |
| Variance | 2.862735607 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 491460 | 65.5% | |
| 3 | 73892 | 9.9% | |
| 5 | 61473 | 8.2% | |
| 2 | 49108 | 6.5% | |
| 1 | 37123 | 4.9% | |
| 4 | 36944 | 4.9% |
| Value | Count | Frequency (%) | |
| 0 | 491460 | 65.5% | |
| 1 | 37123 | 4.9% | |
| 2 | 49108 | 6.5% | |
| 3 | 73892 | 9.9% | |
| 4 | 36944 | 4.9% |
| Value | Count | Frequency (%) | |
| 5 | 61473 | 8.2% | |
| 4 | 36944 | 4.9% | |
| 3 | 73892 | 9.9% | |
| 2 | 49108 | 6.5% | |
| 1 | 37123 | 4.9% |
issue_date
Categorical
| Distinct | 139 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 2016-03-01 | 27230 | 3.6% | |
| 2015-10-01 | 23885 | 3.2% | |
| 2015-07-01 | 23016 | 3.1% | |
| 2015-12-01 | 21727 | 2.9% | |
| 2014-10-01 | 20136 | 2.7% | |
| 2016-02-01 | 19250 | 2.6% | |
| 2015-11-01 | 18238 | 2.4% | |
| 2015-01-01 | 18065 | 2.4% | |
| 2015-04-01 | 17789 | 2.4% | |
| 2015-08-01 | 17596 | 2.3% | |
| Other values (129) | 543068 | 72.4% |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
use
Real number (ℝ≥0)
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.745846667 |
|---|---|
| Minimum | 0 |
| Maximum | 13 |
| Zeros | 435098 |
| Zeros (%) | 58.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 13 |
| Range | 13 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.367289073 |
|---|---|
| Coefficient of variation (CV) | 1.355954746 |
| Kurtosis | 1.250989651 |
| Mean | 1.745846667 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.25109235 |
| Sum | 1309385 |
| Variance | 5.604057555 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 435098 | 58.0% | |
| 4 | 164467 | 21.9% | |
| 2 | 48896 | 6.5% | |
| 5 | 43384 | 5.8% | |
| 3 | 16465 | 2.2% | |
| 9 | 8666 | 1.2% | |
| 1 | 8532 | 1.1% | |
| 8 | 8126 | 1.1% | |
| 10 | 5280 | 0.7% | |
| 7 | 5039 | 0.7% | |
| Other values (4) | 6047 | 0.8% |
| Value | Count | Frequency (%) | |
| 0 | 435098 | 58.0% | |
| 1 | 8532 | 1.1% | |
| 2 | 48896 | 6.5% | |
| 3 | 16465 | 2.2% | |
| 4 | 164467 | 21.9% |
| Value | Count | Frequency (%) | |
| 13 | 182 | < 0.1% | |
| 12 | 1272 | 0.2% | |
| 11 | 523 | 0.1% | |
| 10 | 5280 | 0.7% | |
| 9 | 8666 | 1.2% |
post_code
Real number (ℝ≥0)
| Distinct | 930 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 258.5588661 |
|---|---|
| Minimum | 0 |
| Maximum | 940 |
| Zeros | 2425 |
| Zeros (%) | 0.3% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 103 |
| median | 203 |
| Q3 | 395 |
| 95-th percentile | 654 |
| Maximum | 940 |
| Range | 940 |
| Interquartile range (IQR) | 292 |
Descriptive statistics
| Standard deviation | 200.0489484 |
|---|---|
| Coefficient of variation (CV) | 0.7737075562 |
| Kurtosis | -0.1433229611 |
| Mean | 258.5588661 |
| Median Absolute Deviation (MAD) | 131 |
| Skewness | 0.8320427967 |
| Sum | 193918891 |
| Variance | 40019.58176 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 134 | 8402 | 1.1% | |
| 19 | 8042 | 1.1% | |
| 51 | 7659 | 1.0% | |
| 31 | 6873 | 0.9% | |
| 4 | 6715 | 0.9% | |
| 32 | 6269 | 0.8% | |
| 143 | 5967 | 0.8% | |
| 74 | 5956 | 0.8% | |
| 195 | 5877 | 0.8% | |
| 116 | 5869 | 0.8% | |
| Other values (920) | 682370 | 91.0% |
| Value | Count | Frequency (%) | |
| 0 | 2425 | 0.3% | |
| 1 | 368 | < 0.1% | |
| 2 | 2720 | 0.4% | |
| 3 | 352 | < 0.1% | |
| 4 | 6715 | 0.9% |
| Value | Count | Frequency (%) | |
| 940 | 1 | < 0.1% | |
| 938 | 1 | < 0.1% | |
| 937 | 1 | < 0.1% | |
| 935 | 1 | < 0.1% | |
| 933 | 1 | < 0.1% |
region
Real number (ℝ≥0)
| Distinct | 51 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.388876 |
|---|---|
| Minimum | 0 |
| Maximum | 50 |
| Zeros | 25461 |
| Zeros (%) | 3.4% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 14 |
| Q3 | 22 |
| 95-th percentile | 38 |
| Maximum | 50 |
| Range | 50 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 11.03569641 |
|---|---|
| Coefficient of variation (CV) | 0.673365056 |
| Kurtosis | -0.02959829805 |
| Mean | 16.388876 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.7821439763 |
| Sum | 12291657 |
| Variance | 121.7865952 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 8 | 109597 | 14.6% | |
| 14 | 61683 | 8.2% | |
| 13 | 60995 | 8.1% | |
| 21 | 53295 | 7.1% | |
| 2 | 28601 | 3.8% | |
| 30 | 26873 | 3.6% | |
| 0 | 25461 | 3.4% | |
| 19 | 24543 | 3.3% | |
| 3 | 24141 | 3.2% | |
| 9 | 21477 | 2.9% | |
| Other values (41) | 313334 | 41.8% |
| Value | Count | Frequency (%) | |
| 0 | 25461 | 3.4% | |
| 1 | 1517 | 0.2% | |
| 2 | 28601 | 3.8% | |
| 3 | 24141 | 3.2% | |
| 4 | 13269 | 1.8% |
| Value | Count | Frequency (%) | |
| 50 | 5 | < 0.1% | |
| 49 | 956 | 0.1% | |
| 48 | 1772 | 0.2% | |
| 47 | 1135 | 0.2% | |
| 46 | 891 | 0.1% |
debt_loan_ratio
Real number (ℝ)
| Distinct | 6212 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 230 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.27978884 |
|---|---|
| Minimum | -1 |
| Maximum | 999 |
| Zeros | 483 |
| Zeros (%) | 0.1% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 4.98 |
| Q1 | 11.79 |
| median | 17.61 |
| Q3 | 24.05 |
| 95-th percentile | 32.97 |
| Maximum | 999 |
| Range | 1000 |
| Interquartile range (IQR) | 12.26 |
Descriptive statistics
| Standard deviation | 11.13122337 |
|---|---|
| Coefficient of variation (CV) | 0.6089361025 |
| Kurtosis | 2145.887052 |
| Mean | 18.27978884 |
| Median Absolute Deviation (MAD) | 6.11 |
| Skewness | 27.21025421 |
| Sum | 13705637.28 |
| Variance | 123.9041338 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 19.2 | 562 | 0.1% | |
| 18 | 556 | 0.1% | |
| 13.2 | 544 | 0.1% | |
| 16.8 | 530 | 0.1% | |
| 14.4 | 508 | 0.1% | |
| 12 | 498 | 0.1% | |
| 0 | 483 | 0.1% | |
| 21.6 | 478 | 0.1% | |
| 20.4 | 474 | 0.1% | |
| 15.6 | 471 | 0.1% | |
| Other values (6202) | 744666 | 99.3% |
| Value | Count | Frequency (%) | |
| -1 | 2 | < 0.1% | |
| 0 | 483 | 0.1% | |
| 0.01 | 6 | < 0.1% | |
| 0.02 | 8 | < 0.1% | |
| 0.03 | 9 | < 0.1% |
| Value | Count | Frequency (%) | |
| 999 | 22 | < 0.1% | |
| 991.57 | 1 | < 0.1% | |
| 831.97 | 1 | < 0.1% | |
| 818.1 | 1 | < 0.1% | |
| 797.1 | 1 | < 0.1% |
del_in_18month
Real number (ℝ≥0)
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3184386667 |
|---|---|
| Minimum | 0 |
| Maximum | 39 |
| Zeros | 605314 |
| Zeros (%) | 80.7% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 39 |
| Range | 39 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.8810019961 |
|---|---|
| Coefficient of variation (CV) | 2.766630087 |
| Kurtosis | 63.09156294 |
| Mean | 0.3184386667 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.70498389 |
| Sum | 238829 |
| Variance | 0.7761645171 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 605314 | 80.7% | |
| 1 | 96162 | 12.8% | |
| 2 | 28099 | 3.7% | |
| 3 | 10261 | 1.4% | |
| 4 | 4516 | 0.6% | |
| 5 | 2351 | 0.3% | |
| 6 | 1307 | 0.2% | |
| 7 | 718 | 0.1% | |
| 8 | 412 | 0.1% | |
| 9 | 272 | < 0.1% | |
| Other values (20) | 588 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 605314 | 80.7% | |
| 1 | 96162 | 12.8% | |
| 2 | 28099 | 3.7% | |
| 3 | 10261 | 1.4% | |
| 4 | 4516 | 0.6% |
| Value | Count | Frequency (%) | |
| 39 | 1 | < 0.1% | |
| 30 | 1 | < 0.1% | |
| 29 | 2 | < 0.1% | |
| 27 | 1 | < 0.1% | |
| 26 | 2 | < 0.1% |
| Distinct | 39 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 696.2036867 |
|---|---|
| Minimum | 630 |
| Maximum | 845 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 630 |
|---|---|
| 5-th percentile | 660 |
| Q1 | 670 |
| median | 690 |
| Q3 | 710 |
| 95-th percentile | 760 |
| Maximum | 845 |
| Range | 215 |
| Interquartile range (IQR) | 40 |
Descriptive statistics
| Standard deviation | 31.87235468 |
|---|---|
| Coefficient of variation (CV) | 0.04578021532 |
| Kurtosis | 1.655420499 |
| Mean | 696.2036867 |
| Median Absolute Deviation (MAD) | 20 |
| Skewness | 1.283386223 |
| Sum | 522152765 |
| Variance | 1015.846993 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 660 | 67466 | 9.0% | |
| 670 | 65559 | 8.7% | |
| 665 | 65241 | 8.7% | |
| 675 | 58196 | 7.8% | |
| 680 | 57375 | 7.6% | |
| 685 | 50286 | 6.7% | |
| 690 | 48804 | 6.5% | |
| 695 | 44107 | 5.9% | |
| 700 | 40558 | 5.4% | |
| 705 | 36618 | 4.9% | |
| Other values (29) | 215790 | 28.8% |
| Value | Count | Frequency (%) | |
| 630 | 1 | < 0.1% | |
| 660 | 67466 | 9.0% | |
| 665 | 65241 | 8.7% | |
| 670 | 65559 | 8.7% | |
| 675 | 58196 | 7.8% |
| Value | Count | Frequency (%) | |
| 845 | 112 | < 0.1% | |
| 840 | 132 | < 0.1% | |
| 835 | 242 | < 0.1% | |
| 830 | 386 | 0.1% | |
| 825 | 597 | 0.1% |
| Distinct | 39 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 700.203836 |
|---|---|
| Minimum | 634 |
| Maximum | 850 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 634 |
|---|---|
| 5-th percentile | 664 |
| Q1 | 674 |
| median | 694 |
| Q3 | 714 |
| 95-th percentile | 764 |
| Maximum | 850 |
| Range | 216 |
| Interquartile range (IQR) | 40 |
Descriptive statistics
| Standard deviation | 31.87305418 |
|---|---|
| Coefficient of variation (CV) | 0.04551967947 |
| Kurtosis | 1.656913873 |
| Mean | 700.203836 |
| Median Absolute Deviation (MAD) | 20 |
| Skewness | 1.283596067 |
| Sum | 525152877 |
| Variance | 1015.891583 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 664 | 67466 | 9.0% | |
| 674 | 65559 | 8.7% | |
| 669 | 65241 | 8.7% | |
| 679 | 58196 | 7.8% | |
| 684 | 57375 | 7.6% | |
| 689 | 50286 | 6.7% | |
| 694 | 48804 | 6.5% | |
| 699 | 44107 | 5.9% | |
| 704 | 40558 | 5.4% | |
| 709 | 36618 | 4.9% | |
| Other values (29) | 215790 | 28.8% |
| Value | Count | Frequency (%) | |
| 634 | 1 | < 0.1% | |
| 664 | 67466 | 9.0% | |
| 669 | 65241 | 8.7% | |
| 674 | 65559 | 8.7% | |
| 679 | 58196 | 7.8% |
| Value | Count | Frequency (%) | |
| 850 | 112 | < 0.1% | |
| 844 | 132 | < 0.1% | |
| 839 | 242 | < 0.1% | |
| 834 | 386 | 0.1% | |
| 829 | 597 | 0.1% |
pub_dero_bankrup
Real number (ℝ≥0)
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 375 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1339949975 |
|---|---|
| Minimum | 0 |
| Maximum | 12 |
| Zeros | 656470 |
| Zeros (%) | 87.5% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 12 |
| Range | 12 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3774905176 |
|---|---|
| Coefficient of variation (CV) | 2.817198587 |
| Kurtosis | 20.43401137 |
| Mean | 0.1339949975 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.45945843 |
| Sum | 100446 |
| Variance | 0.1424990909 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 656470 | 87.5% | |
| 1 | 87630 | 11.7% | |
| 2 | 4297 | 0.6% | |
| 3 | 880 | 0.1% | |
| 4 | 231 | < 0.1% | |
| 5 | 76 | < 0.1% | |
| 6 | 23 | < 0.1% | |
| 7 | 11 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 9 | 3 | < 0.1% | |
| (Missing) | 375 | 0.1% |
| Value | Count | Frequency (%) | |
| 0 | 656470 | 87.5% | |
| 1 | 87630 | 11.7% | |
| 2 | 4297 | 0.6% | |
| 3 | 880 | 0.1% | |
| 4 | 231 | < 0.1% |
| Value | Count | Frequency (%) | |
| 12 | 1 | < 0.1% | |
| 9 | 3 | < 0.1% | |
| 8 | 3 | < 0.1% | |
| 7 | 11 | < 0.1% | |
| 6 | 23 | < 0.1% |
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.197812 |
|---|---|
| Minimum | 0 |
| Maximum | 29 |
| Zeros | 690074 |
| Zeros (%) | 92.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 11 |
| Maximum | 29 |
| Range | 29 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 4.703108182 |
|---|---|
| Coefficient of variation (CV) | 3.926415983 |
| Kurtosis | 17.39499443 |
| Mean | 1.197812 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.225263246 |
| Sum | 898359 |
| Variance | 22.11922657 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 690074 | 92.0% | |
| 2 | 2164 | 0.3% | |
| 14 | 2144 | 0.3% | |
| 13 | 2140 | 0.3% | |
| 22 | 2128 | 0.3% | |
| 19 | 2108 | 0.3% | |
| 3 | 2106 | 0.3% | |
| 20 | 2103 | 0.3% | |
| 26 | 2096 | 0.3% | |
| 8 | 2090 | 0.3% | |
| Other values (20) | 40847 | 5.4% |
| Value | Count | Frequency (%) | |
| 0 | 690074 | 92.0% | |
| 1 | 2049 | 0.3% | |
| 2 | 2164 | 0.3% | |
| 3 | 2106 | 0.3% | |
| 4 | 2001 | 0.3% |
| Value | Count | Frequency (%) | |
| 29 | 2043 | 0.3% | |
| 28 | 2019 | 0.3% | |
| 27 | 2069 | 0.3% | |
| 26 | 2096 | 0.3% | |
| 25 | 2075 | 0.3% |
| Distinct | 1867 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 95.27616133 |
|---|---|
| Minimum | 0 |
| Maximum | 4321 |
| Zeros | 690074 |
| Zeros (%) | 92.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 638 |
| Maximum | 4321 |
| Range | 4321 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 422.2637353 |
|---|---|
| Coefficient of variation (CV) | 4.431997778 |
| Kurtosis | 32.94447865 |
| Mean | 95.27616133 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.482002253 |
| Sum | 71457121 |
| Variance | 178306.6622 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 690074 | 92.0% | |
| 240 | 191 | < 0.1% | |
| 336 | 181 | < 0.1% | |
| 360 | 177 | < 0.1% | |
| 720 | 172 | < 0.1% | |
| 252 | 170 | < 0.1% | |
| 504 | 162 | < 0.1% | |
| 420 | 158 | < 0.1% | |
| 270 | 154 | < 0.1% | |
| 288 | 149 | < 0.1% | |
| Other values (1857) | 58412 | 7.8% |
| Value | Count | Frequency (%) | |
| 0 | 690074 | 92.0% | |
| 10 | 12 | < 0.1% | |
| 11 | 12 | < 0.1% | |
| 12 | 16 | < 0.1% | |
| 13 | 15 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4321 | 10 | < 0.1% | |
| 4292 | 16 | < 0.1% | |
| 4263 | 11 | < 0.1% | |
| 4234 | 15 | < 0.1% | |
| 4205 | 16 | < 0.1% |
early_return_amount_3mon
Real number (ℝ≥0)
| Distinct | 59927 |
|---|---|
| Distinct (%) | 8.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.04914346 |
|---|---|
| Minimum | 0 |
| Maximum | 2539.717451 |
| Zeros | 690074 |
| Zeros (%) | 92.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 237.4731343 |
| Maximum | 2539.717451 |
| Range | 2539.717451 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 175.6506531 |
|---|---|
| Coefficient of variation (CV) | 4.616415432 |
| Kurtosis | 43.88338204 |
| Mean | 38.04914346 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.12527588 |
| Sum | 28536857.6 |
| Variance | 30853.15192 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 690074 | 92.0% | |
| 170.6807235 | 1 | < 0.1% | |
| 223.2408119 | 1 | < 0.1% | |
| 970.5881306 | 1 | < 0.1% | |
| 399.7055184 | 1 | < 0.1% | |
| 144.2136159 | 1 | < 0.1% | |
| 99.00719176 | 1 | < 0.1% | |
| 1129.682397 | 1 | < 0.1% | |
| 1117.043613 | 1 | < 0.1% | |
| 719.0778238 | 1 | < 0.1% | |
| Other values (59917) | 59917 | 8.0% |
| Value | Count | Frequency (%) | |
| 0 | 690074 | 92.0% | |
| 2.239298695 | 1 | < 0.1% | |
| 2.41164012 | 1 | < 0.1% | |
| 2.809767768 | 1 | < 0.1% | |
| 2.848548882 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2539.717451 | 1 | < 0.1% | |
| 2534.695438 | 1 | < 0.1% | |
| 2505.568135 | 1 | < 0.1% | |
| 2488.146224 | 1 | < 0.1% | |
| 2478.778507 | 1 | < 0.1% |
recircle_b
Real number (ℝ≥0)
| Distinct | 69748 |
|---|---|
| Distinct (%) | 9.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16240.00772 |
|---|---|
| Minimum | 0 |
| Maximum | 2904836 |
| Zeros | 3720 |
| Zeros (%) | 0.5% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1664 |
| Q1 | 5944 |
| median | 11135 |
| Q3 | 19734 |
| 95-th percentile | 43295.05 |
| Maximum | 2904836 |
| Range | 2904836 |
| Interquartile range (IQR) | 13790 |
Descriptive statistics
| Standard deviation | 22578.79737 |
|---|---|
| Coefficient of variation (CV) | 1.390319374 |
| Kurtosis | 1049.546807 |
| Mean | 16240.00772 |
| Median Absolute Deviation (MAD) | 6161 |
| Skewness | 16.29767422 |
| Sum | 1.218000579e+10 |
| Variance | 509802090.9 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 3720 | 0.5% | |
| 4784 | 68 | < 0.1% | |
| 6325 | 65 | < 0.1% | |
| 6018 | 62 | < 0.1% | |
| 5235 | 62 | < 0.1% | |
| 6312 | 60 | < 0.1% | |
| 4886 | 59 | < 0.1% | |
| 5849 | 59 | < 0.1% | |
| 5232 | 59 | < 0.1% | |
| 5833 | 58 | < 0.1% | |
| Other values (69738) | 745728 | 99.4% |
| Value | Count | Frequency (%) | |
| 0 | 3720 | 0.5% | |
| 1 | 40 | < 0.1% | |
| 2 | 51 | < 0.1% | |
| 3 | 49 | < 0.1% | |
| 4 | 45 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2904836 | 1 | < 0.1% | |
| 2568995 | 1 | < 0.1% | |
| 2560703 | 1 | < 0.1% | |
| 1746716 | 1 | < 0.1% | |
| 1696796 | 1 | < 0.1% |
recircle_u
Real number (ℝ≥0)
| Distinct | 1278 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 489 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 51.78773902 |
|---|---|
| Minimum | 0 |
| Maximum | 892.3 |
| Zeros | 3931 |
| Zeros (%) | 0.5% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10.3 |
| Q1 | 33.4 |
| median | 52.1 |
| Q3 | 70.7 |
| 95-th percentile | 91.5 |
| Maximum | 892.3 |
| Range | 892.3 |
| Interquartile range (IQR) | 37.3 |
Descriptive statistics
| Standard deviation | 24.51750678 |
|---|---|
| Coefficient of variation (CV) | 0.4734230003 |
| Kurtosis | 1.039394386 |
| Mean | 51.78773902 |
| Median Absolute Deviation (MAD) | 18.6 |
| Skewness | -0.01530825988 |
| Sum | 38815480.06 |
| Variance | 601.1081389 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 3931 | 0.5% | |
| 57 | 1491 | 0.2% | |
| 61 | 1491 | 0.2% | |
| 53 | 1484 | 0.2% | |
| 55 | 1479 | 0.2% | |
| 59 | 1474 | 0.2% | |
| 58 | 1468 | 0.2% | |
| 46 | 1433 | 0.2% | |
| 54 | 1430 | 0.2% | |
| 62 | 1426 | 0.2% | |
| Other values (1268) | 732404 | 97.7% |
| Value | Count | Frequency (%) | |
| 0 | 3931 | 0.5% | |
| 0.01 | 1 | < 0.1% | |
| 0.1 | 541 | 0.1% | |
| 0.12 | 1 | < 0.1% | |
| 0.2 | 415 | 0.1% |
| Value | Count | Frequency (%) | |
| 892.3 | 1 | < 0.1% | |
| 180.3 | 1 | < 0.1% | |
| 165.8 | 1 | < 0.1% | |
| 162 | 1 | < 0.1% | |
| 156.3 | 1 | < 0.1% |
initial_list_status
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 437116 | 58.3% | |
| 1 | 312884 | 41.7% |
earlies_credit_mon
Categorical
| Distinct | 718 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| Aug-2001 | 5215 | 0.7% | |
| Aug-2002 | 5065 | 0.7% | |
| Sep-2003 | 5050 | 0.7% | |
| Oct-2001 | 4925 | 0.7% | |
| Sep-2004 | 4909 | 0.7% | |
| Aug-2000 | 4885 | 0.7% | |
| Sep-2002 | 4854 | 0.6% | |
| Aug-2003 | 4790 | 0.6% | |
| Oct-2002 | 4744 | 0.6% | |
| Oct-2000 | 4698 | 0.6% | |
| Other values (708) | 700865 | 93.4% |
Unique
| Unique | 25 |
|---|---|
| Unique (%) | < 0.1% |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
title
Real number (ℝ≥0)
| Distinct | 37498 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1753.21224 |
|---|---|
| Minimum | 0 |
| Maximum | 61680 |
| Zeros | 368735 |
| Zeros (%) | 49.2% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 5 |
| 95-th percentile | 8756.1 |
| Maximum | 61680 |
| Range | 61680 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 7939.716437 |
|---|---|
| Coefficient of variation (CV) | 4.528668153 |
| Kurtosis | 28.00006618 |
| Mean | 1753.21224 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 5.227824819 |
| Sum | 1314907427 |
| Variance | 63039097.11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 368735 | 49.2% | |
| 4 | 138955 | 18.5% | |
| 5 | 41973 | 5.6% | |
| 6 | 37260 | 5.0% | |
| 3 | 13511 | 1.8% | |
| 2 | 9260 | 1.2% | |
| 38 | 8820 | 1.2% | |
| 10 | 7575 | 1.0% | |
| 1 | 6567 | 0.9% | |
| 9 | 6312 | 0.8% | |
| Other values (37488) | 111031 | 14.8% |
| Value | Count | Frequency (%) | |
| 0 | 368735 | 49.2% | |
| 1 | 6567 | 0.9% | |
| 2 | 9260 | 1.2% | |
| 3 | 13511 | 1.8% | |
| 4 | 138955 | 18.5% |
| Value | Count | Frequency (%) | |
| 61680 | 1 | < 0.1% | |
| 61679 | 1 | < 0.1% | |
| 61678 | 1 | < 0.1% | |
| 61677 | 1 | < 0.1% | |
| 61674 | 1 | < 0.1% |
policy_code
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 1 | 750000 | 100.0% |
f0
Real number (ℝ≥0)
| Distinct | 44 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 37798 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.592246301 |
|---|---|
| Minimum | 0 |
| Maximum | 45 |
| Zeros | 3041 |
| Zeros (%) | 0.4% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 12 |
| Maximum | 45 |
| Range | 45 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 3.216788433 |
|---|---|
| Coefficient of variation (CV) | 0.5752229533 |
| Kurtosis | 3.950298597 |
| Mean | 5.592246301 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 1.433422755 |
| Sum | 3982809 |
| Variance | 10.34772782 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 4 | 110696 | 14.8% | |
| 5 | 102216 | 13.6% | |
| 3 | 100523 | 13.4% | |
| 6 | 83316 | 11.1% | |
| 2 | 66457 | 8.9% | |
| 7 | 64045 | 8.5% | |
| 8 | 47171 | 6.3% | |
| 9 | 33434 | 4.5% | |
| 1 | 24829 | 3.3% | |
| 10 | 23278 | 3.1% | |
| Other values (34) | 56237 | 7.5% | |
| (Missing) | 37798 | 5.0% |
| Value | Count | Frequency (%) | |
| 0 | 3041 | 0.4% | |
| 1 | 24829 | 3.3% | |
| 2 | 66457 | 8.9% | |
| 3 | 100523 | 13.4% | |
| 4 | 110696 | 14.8% |
| Value | Count | Frequency (%) | |
| 45 | 1 | < 0.1% | |
| 44 | 1 | < 0.1% | |
| 43 | 4 | < 0.1% | |
| 42 | 1 | < 0.1% | |
| 39 | 4 | < 0.1% |
f1
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 65411 |
| Missing (%) | 8.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.0008048624795 |
|---|---|
| Minimum | 0 |
| Maximum | 4 |
| Zeros | 684067 |
| Zeros (%) | 91.2% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 4 |
| Range | 4 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.03001040155 |
|---|---|
| Coefficient of variation (CV) | 37.28637167 |
| Kurtosis | 2184.171451 |
| Mean | 0.0008048624795 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 42.0206785 |
| Sum | 551 |
| Variance | 0.0009006242014 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 684067 | 91.2% | |
| 1 | 496 | 0.1% | |
| 2 | 24 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| (Missing) | 65411 | 8.7% |
| Value | Count | Frequency (%) | |
| 0 | 684067 | 91.2% | |
| 1 | 496 | 0.1% | |
| 2 | 24 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 4 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 2 | 24 | < 0.1% | |
| 1 | 496 | 0.1% | |
| 0 | 684067 | 91.2% |
f2
Real number (ℝ≥0)
| Distinct | 107 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 37798 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.57743028 |
|---|---|
| Minimum | 0 |
| Maximum | 132 |
| Zeros | 21233 |
| Zeros (%) | 2.8% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 11 |
| 95-th percentile | 23 |
| Maximum | 132 |
| Range | 132 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 7.403955252 |
|---|---|
| Coefficient of variation (CV) | 0.8631903741 |
| Kurtosis | 7.555119837 |
| Mean | 8.57743028 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 2.065202494 |
| Sum | 6108863 |
| Variance | 54.81855337 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 4 | 60548 | 8.1% | |
| 3 | 60179 | 8.0% | |
| 5 | 57924 | 7.7% | |
| 2 | 54403 | 7.3% | |
| 6 | 53602 | 7.1% | |
| 7 | 48078 | 6.4% | |
| 8 | 42974 | 5.7% | |
| 1 | 41379 | 5.5% | |
| 9 | 36969 | 4.9% | |
| 10 | 32142 | 4.3% | |
| Other values (97) | 224004 | 29.9% | |
| (Missing) | 37798 | 5.0% |
| Value | Count | Frequency (%) | |
| 0 | 21233 | 2.8% | |
| 1 | 41379 | 5.5% | |
| 2 | 54403 | 7.3% | |
| 3 | 60179 | 8.0% | |
| 4 | 60548 | 8.1% |
| Value | Count | Frequency (%) | |
| 132 | 1 | < 0.1% | |
| 131 | 1 | < 0.1% | |
| 128 | 1 | < 0.1% | |
| 117 | 1 | < 0.1% | |
| 113 | 1 | < 0.1% |
f3
Real number (ℝ≥0)
| Distinct | 102 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 37799 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.62419738 |
|---|---|
| Minimum | 1 |
| Maximum | 128 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 9 |
| median | 13 |
| Q3 | 19 |
| 95-th percentile | 30 |
| Maximum | 128 |
| Range | 127 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 8.125875284 |
|---|---|
| Coefficient of variation (CV) | 0.5556458978 |
| Kurtosis | 3.121521135 |
| Mean | 14.62419738 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 1.305720747 |
| Sum | 10415368 |
| Variance | 66.02984914 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 11 | 42069 | 5.6% | |
| 10 | 41938 | 5.6% | |
| 9 | 41086 | 5.5% | |
| 12 | 40657 | 5.4% | |
| 8 | 39394 | 5.3% | |
| 13 | 38534 | 5.1% | |
| 7 | 36422 | 4.9% | |
| 14 | 36136 | 4.8% | |
| 15 | 33849 | 4.5% | |
| 6 | 31739 | 4.2% | |
| Other values (92) | 330377 | 44.1% | |
| (Missing) | 37799 | 5.0% |
| Value | Count | Frequency (%) | |
| 1 | 10 | < 0.1% | |
| 2 | 4906 | 0.7% | |
| 3 | 11302 | 1.5% | |
| 4 | 18712 | 2.5% | |
| 5 | 25566 | 3.4% |
| Value | Count | Frequency (%) | |
| 128 | 1 | < 0.1% | |
| 127 | 1 | < 0.1% | |
| 107 | 1 | < 0.1% | |
| 105 | 2 | < 0.1% | |
| 104 | 2 | < 0.1% |
f4
Real number (ℝ≥0)
| Distinct | 65 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 37798 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.110185032 |
|---|---|
| Minimum | 0 |
| Maximum | 70 |
| Zeros | 1588 |
| Zeros (%) | 0.2% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 7 |
| Q3 | 11 |
| 95-th percentile | 17 |
| Maximum | 70 |
| Range | 70 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.800043044 |
|---|---|
| Coefficient of variation (CV) | 0.5918537031 |
| Kurtosis | 3.325677987 |
| Mean | 8.110185032 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.339864706 |
| Sum | 5776090 |
| Variance | 23.04041323 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5 | 71765 | 9.6% | |
| 6 | 70954 | 9.5% | |
| 4 | 66701 | 8.9% | |
| 7 | 66609 | 8.9% | |
| 8 | 60248 | 8.0% | |
| 3 | 52586 | 7.0% | |
| 9 | 52212 | 7.0% | |
| 10 | 44072 | 5.9% | |
| 11 | 36819 | 4.9% | |
| 2 | 33132 | 4.4% | |
| Other values (55) | 157104 | 20.9% | |
| (Missing) | 37798 | 5.0% |
| Value | Count | Frequency (%) | |
| 0 | 1588 | 0.2% | |
| 1 | 12978 | 1.7% | |
| 2 | 33132 | 4.4% | |
| 3 | 52586 | 7.0% | |
| 4 | 66701 | 8.9% |
| Value | Count | Frequency (%) | |
| 70 | 1 | < 0.1% | |
| 68 | 1 | < 0.1% | |
| 66 | 1 | < 0.1% | |
| 64 | 1 | < 0.1% | |
| 63 | 1 | < 0.1% |
f5
Real number (ℝ≥0)
| Distinct | 39 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 37798 |
| Missing (%) | 5.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5118997138 |
|---|---|
| Minimum | 0 |
| Maximum | 51 |
| Zeros | 543147 |
| Zeros (%) | 72.4% |
| Memory size | 5.7 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 3 |
| Maximum | 51 |
| Range | 51 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.332782746 |
|---|---|
| Coefficient of variation (CV) | 2.603601272 |
| Kurtosis | 47.02875005 |
| Mean | 0.5118997138 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.116538805 |
| Sum | 364576 |
| Variance | 1.776309848 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 543147 | 72.4% | |
| 1 | 90085 | 12.0% | |
| 2 | 36429 | 4.9% | |
| 3 | 16541 | 2.2% | |
| 4 | 9847 | 1.3% | |
| 5 | 5741 | 0.8% | |
| 6 | 3658 | 0.5% | |
| 7 | 2332 | 0.3% | |
| 8 | 1471 | 0.2% | |
| 9 | 939 | 0.1% | |
| Other values (29) | 2012 | 0.3% | |
| (Missing) | 37798 | 5.0% |
| Value | Count | Frequency (%) | |
| 0 | 543147 | 72.4% | |
| 1 | 90085 | 12.0% | |
| 2 | 36429 | 4.9% | |
| 3 | 16541 | 2.2% | |
| 4 | 9847 | 1.3% |
| Value | Count | Frequency (%) | |
| 51 | 1 | < 0.1% | |
| 39 | 1 | < 0.1% | |
| 38 | 1 | < 0.1% | |
| 35 | 1 | < 0.1% | |
| 34 | 2 | < 0.1% |
is_default
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.7 MiB |
| Value | Count | Frequency (%) | |
| 0 | 600327 | 80.0% | |
| 1 | 149673 | 20.0% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the steepness of the slope does not affect r; only its sign does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
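The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator runs; the function name `pearson_r` and the sample data are assumptions made for the example. It also demonstrates the invariance under changes of location and scale.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products of deviations are sufficient.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_steep = pearson_r(x, [2 * v + 1 for v in x])       # steep positive line
r_shallow = pearson_r(x, [0.5 * v - 10 for v in x])  # shallow, shifted line
r_negative = pearson_r(x, [-3 * v for v in x])       # negative slope

print(r_steep, r_shallow, r_negative)
```

Both positive lines give the same coefficient regardless of slope and offset, while the negative line gives the opposite sign.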
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
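As a rough sketch of that definition, ρ can be obtained by ranking both variables and then applying Pearson's formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and assigning tied values the average of their ranks is an assumption of this example (it is a common convention, but the report does not state which one it uses).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cubic = [v ** 3 for v in x]  # monotonic but nonlinear in x

rho = spearman_rho(x, y_cubic)
r = pearson_r(x, y_cubic)
tied_ranks = average_ranks([10, 20, 20, 30])
print(rho, r, tied_ranks)
```

For the cubic relationship, ρ reaches its maximum because the ordering is preserved exactly, whereas r stays below 1 because the relationship is not linear.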
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
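The pair-counting definition above corresponds to the variant usually called tau-a, and a direct (quadratic-time) sketch of it follows. The function name `kendall_tau_a` and the sample data are assumptions for illustration. Statistical libraries often default to the tie-adjusted tau-b variant instead, so values can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.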
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | loan_id | user_id | total_loan | year_of_loan | interest | monthly_payment | class | sub_class | work_type | employer_type | industry | work_year | house_exist | house_loan_status | censor_status | marriage | offsprings | issue_date | use | post_code | region | debt_loan_ratio | del_in_18month | scoring_low | scoring_high | pub_dero_bankrup | early_return | early_return_amount | early_return_amount_3mon | recircle_b | recircle_u | initial_list_status | earlies_credit_mon | title | policy_code | f0 | f1 | f2 | f3 | f4 | f5 | is_default |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 119262 | 0 | 12000.0 | 5 | 11.53 | 264.10 | B | B5 | 职员 | 普通企业 | 采矿业 | NaN | 0 | 0 | 2 | 0 | 0 | 2015-06-01 | 0 | 814.0 | 4 | 5.07 | 1.0 | 670.0 | 674.0 | 1.0 | 0 | 0 | 0.0 | 3855.0 | 23.1 | 0 | Mar-1984 | 0.0 | 1.0 | 1.0 | 0.0 | 8.0 | 17.0 | 8.0 | 1.0 | 1 |
| 1 | 369815 | 1 | 8000.0 | 3 | 13.98 | 273.35 | C | C3 | 其他 | 普通企业 | 国际组织 | 10+ years | 0 | 1 | 2 | 1 | 3 | 2010-10-01 | 2 | 240.0 | 21 | 15.04 | 0.0 | 725.0 | 729.0 | 0.0 | 0 | 0 | 0.0 | 118632.0 | 99.9 | 1 | Jan-1992 | 94.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
| 2 | 787833 | 2 | 20000.0 | 5 | 17.99 | 507.76 | D | D2 | 工人 | 上市企业 | 信息传输、软件和信息技术服务业 | 10+ years | 0 | 0 | 1 | 0 | 0 | 2016-08-01 | 0 | 164.0 | 20 | 17.38 | 1.0 | 675.0 | 679.0 | 0.0 | 0 | 0 | 0.0 | 15670.0 | 72.5 | 0 | Oct-1996 | 0.0 | 1.0 | 6.0 | 0.0 | 10.0 | 8.0 | 3.0 | 0.0 | 0 |
| 3 | 671675 | 3 | 10700.0 | 3 | 10.16 | 346.07 | B | B1 | 职员 | 普通企业 | 电力、热力生产供应业 | 2 years | 2 | 0 | 2 | 0 | 0 | 2013-05-01 | 4 | 48.0 | 10 | 27.87 | 0.0 | 710.0 | 714.0 | 0.0 | 0 | 0 | 0.0 | 18859.0 | 78.6 | 0 | Jul-2000 | 41646.0 | 1.0 | 3.0 | 0.0 | 4.0 | 11.0 | 6.0 | 0.0 | 0 |
| 4 | 245160 | 4 | 8000.0 | 3 | 8.24 | 251.58 | B | B1 | 其他 | 政府机构 | 金融业 | 5 years | 1 | 2 | 0 | 0 | 0 | 2017-04-01 | 4 | 122.0 | 9 | 3.47 | 0.0 | 660.0 | 664.0 | 0.0 | 0 | 0 | 0.0 | 8337.0 | 67.8 | 1 | Mar-2000 | 4.0 | 1.0 | 3.0 | 0.0 | 8.0 | 6.0 | 4.0 | 1.0 | 0 |
| 5 | 647107 | 5 | 28000.0 | 3 | 15.59 | 978.74 | C | C5 | 职员 | 幼教与中小学校 | 公共服务、社会组织 | 10+ years | 2 | 0 | 2 | 0 | 0 | 2016-08-01 | 0 | 149.0 | 22 | 24.33 | 0.0 | 680.0 | 684.0 | 0.0 | 0 | 0 | 0.0 | 40727.0 | 88.6 | 0 | May-2002 | 0.0 | 1.0 | 3.0 | 0.0 | 6.0 | 10.0 | 3.0 | 1.0 | 1 |
| 6 | 289151 | 6 | 6000.0 | 3 | 7.89 | 187.72 | A | A5 | 职员 | 政府机构 | 信息传输、软件和信息技术服务业 | 8 years | 0 | 1 | 0 | 0 | 0 | 2015-07-01 | 0 | 634.0 | 32 | 8.43 | 0.0 | 710.0 | 714.0 | 0.0 | 0 | 0 | 0.0 | 724.0 | 14.2 | 1 | Aug-2000 | 0.0 | 1.0 | 3.0 | 0.0 | 13.0 | 5.0 | 5.0 | 1.0 | 0 |
| 7 | 750155 | 7 | 20000.0 | 3 | 12.79 | 671.86 | C | C1 | 工程师 | 上市企业 | 金融业 | 10+ years | 0 | 0 | 2 | 0 | 0 | 2016-07-01 | 0 | 197.0 | 4 | 19.48 | 0.0 | 690.0 | 694.0 | 0.0 | 0 | 0 | 0.0 | 16694.0 | 71.6 | 0 | Oct-2005 | 0.0 | 1.0 | 8.0 | 0.0 | 3.0 | 12.0 | 8.0 | 0.0 | 0 |
| 8 | 387697 | 8 | 9450.0 | 3 | 13.11 | 318.91 | B | B4 | 工人 | 政府机构 | 信息传输、软件和信息技术服务业 | 2 years | 0 | 0 | 0 | 1 | 0 | 2012-08-01 | 4 | 19.0 | 14 | 18.64 | 0.0 | 705.0 | 709.0 | 0.0 | 0 | 0 | 0.0 | 14291.0 | 66.5 | 1 | Apr-2001 | 847.0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
| 9 | 186940 | 9 | 4500.0 | 3 | 5.32 | 135.52 | A | A1 | 其他 | 高等教育机构 | 文化和体育业 | 10+ years | 0 | 0 | 0 | 1 | 0 | 2017-02-01 | 9 | 468.0 | 0 | 7.40 | 0.0 | 805.0 | 809.0 | 0.0 | 0 | 0 | 0.0 | 1623.0 | 10.5 | 0 | Sep-1992 | 10.0 | 1.0 | 2.0 | 0.0 | 9.0 | 8.0 | 2.0 | 0.0 | 0 |
Last rows
| | loan_id | user_id | total_loan | year_of_loan | interest | monthly_payment | class | sub_class | work_type | employer_type | industry | work_year | house_exist | house_loan_status | censor_status | marriage | offsprings | issue_date | use | post_code | region | debt_loan_ratio | del_in_18month | scoring_low | scoring_high | pub_dero_bankrup | early_return | early_return_amount | early_return_amount_3mon | recircle_b | recircle_u | initial_list_status | earlies_credit_mon | title | policy_code | f0 | f1 | f2 | f3 | f4 | f5 | is_default |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 749990 | 733613 | 749990 | 29175.0 | 3 | 20.80 | 1096.18 | E | E1 | 其他 | 政府机构 | 金融业 | 5 years | 1 | 0 | 2 | 1 | 5 | 2013-08-01 | 1 | 228.0 | 24 | 8.25 | 0.0 | 665.0 | 669.0 | 0.0 | 0 | 0 | 0.000000 | 25477.0 | 81.7 | 1 | Oct-2003 | 35722.0 | 1.0 | 7.0 | 0.0 | 14.0 | 17.0 | 9.0 | 0.0 | 0 |
| 749991 | 575158 | 749991 | 40000.0 | 3 | 7.89 | 1251.43 | A | A5 | 职员 | 普通企业 | 电力、热力生产供应业 | 7 years | 2 | 0 | 2 | 1 | 1 | 2016-05-01 | 0 | 4.0 | 3 | 13.79 | 0.0 | 720.0 | 724.0 | 1.0 | 0 | 0 | 0.000000 | 22494.0 | 19.3 | 0 | Sep-2000 | 0.0 | 1.0 | 4.0 | 0.0 | 12.0 | 20.0 | 11.0 | 0.0 | 0 |
| 749992 | 588000 | 749992 | 10000.0 | 5 | 8.18 | 203.63 | B | B1 | 公务员 | 政府机构 | 金融业 | < 1 year | 1 | 0 | 1 | 1 | 0 | 2015-09-01 | 4 | 275.0 | 8 | 9.53 | 0.0 | 680.0 | 684.0 | 0.0 | 29 | 290 | 122.642036 | 7184.0 | 86.6 | 0 | Sep-2005 | 4.0 | 1.0 | 3.0 | 0.0 | 18.0 | 6.0 | 4.0 | 0.0 | 0 |
| 749993 | 291383 | 749993 | 40000.0 | 5 | 10.90 | 867.71 | B | B4 | 其他 | 政府机构 | 金融业 | 9 years | 2 | 0 | 1 | 1 | 3 | 2018-03-01 | 2 | 248.0 | 11 | 17.42 | 0.0 | 700.0 | 704.0 | 0.0 | 0 | 0 | 0.000000 | 5192.0 | 7.2 | 0 | Dec-2008 | 5.0 | 1.0 | 9.0 | 0.0 | 9.0 | 24.0 | 7.0 | 0.0 | 0 |
| 749994 | 549991 | 749994 | 12000.0 | 3 | 6.49 | 367.74 | A | A2 | 公务员 | 政府机构 | 信息传输、软件和信息技术服务业 | 10+ years | 0 | 0 | 0 | 1 | 0 | 2014-11-01 | 0 | 112.0 | 23 | 19.76 | 0.0 | 710.0 | 714.0 | 0.0 | 0 | 0 | 0.000000 | 23078.0 | 50.5 | 0 | Mar-1994 | 0.0 | 1.0 | 7.0 | 0.0 | 6.0 | 15.0 | 11.0 | 0.0 | 0 |
| 749995 | 624287 | 749995 | 12000.0 | 3 | 11.47 | 395.55 | B | B5 | 职员 | 上市企业 | 文化和体育业 | 4 years | 0 | 0 | 1 | 0 | 0 | 2016-02-01 | 0 | 95.0 | 13 | 21.55 | 0.0 | 665.0 | 669.0 | 0.0 | 0 | 0 | 0.000000 | 9572.0 | 62.2 | 0 | Jun-1995 | 0.0 | 1.0 | 6.0 | 0.0 | 8.0 | 22.0 | 12.0 | 5.0 | 0 |
| 749996 | 427602 | 749996 | 12000.0 | 3 | 6.03 | 365.23 | A | A1 | 工人 | 政府机构 | 住宿和餐饮业 | 8 years | 1 | 0 | 2 | 0 | 0 | 2014-03-01 | 4 | 74.0 | 30 | 4.52 | 0.0 | 770.0 | 774.0 | 0.0 | 6 | 810 | 352.901898 | 14183.0 | 30.5 | 1 | Sep-2001 | 4.0 | 1.0 | 2.0 | 0.0 | 7.0 | 4.0 | 4.0 | 0.0 | 0 |
| 749997 | 206828 | 749997 | 10000.0 | 3 | 15.41 | 348.67 | D | D1 | 职员 | 政府机构 | 住宿和餐饮业 | 8 years | 1 | 2 | 2 | 1 | 0 | 2015-12-01 | 7 | 74.0 | 30 | 17.25 | 0.0 | 665.0 | 669.0 | 0.0 | 0 | 0 | 0.000000 | 9259.0 | 72.9 | 0 | Oct-2008 | 8.0 | 1.0 | 6.0 | 0.0 | 10.0 | 11.0 | 3.0 | 0.0 | 0 |
| 749998 | 293912 | 749998 | 7200.0 | 3 | 9.44 | 230.44 | B | B1 | 其他 | 政府机构 | 信息传输、软件和信息技术服务业 | 10+ years | 2 | 0 | 2 | 3 | 2 | 2017-12-01 | 0 | 134.0 | 8 | 24.85 | 0.0 | 675.0 | 679.0 | 1.0 | 0 | 0 | 0.000000 | 9825.0 | 71.2 | 0 | Apr-2006 | 0.0 | 1.0 | 7.0 | 0.0 | 9.0 | 11.0 | 6.0 | 0.0 | 0 |
| 749999 | 381388 | 749999 | 16000.0 | 5 | 19.22 | 416.99 | D | D4 | 职员 | 幼教与中小学校 | 文化和体育业 | 3 years | 1 | 0 | 1 | 1 | 2 | 2013-12-01 | 5 | 616.0 | 0 | 15.08 | 0.0 | 695.0 | 699.0 | 0.0 | 0 | 0 | 0.000000 | 9019.0 | 78.4 | 1 | Feb-2001 | 6.0 | 1.0 | 7.0 | 0.0 | 9.0 | 13.0 | 6.0 | 0.0 | 0 |